Abstract

The New Millennium Remote Agent (NMRA) will be the 
first AI system to control an actual spacecraft. The space- 
craft domain places a strong premium on autonomy and 
requires dynamic recoveries and robust concurrent execu- 
tion, all in the presence of tight real-time deadlines, changing 
goals, scarce resource constraints, and a wide variety of pos- 
sible failures. To achieve this level of execution robustness, 
we have integrated a procedural executive based on generic 
procedures with a deductive model-based executive. A pro- 
cedural executive provides sophisticated control constructs 
such as loops, parallel activity, locks, and synchronization 
which are used for robust schedule execution, hierarchical 
task decomposition, and routine configuration management. 
A deductive executive provides algorithms for sophisticated 
state inference and optimal failure recovery planning. The
integrated executive enables designers to code knowledge via 
a combination of procedures and declarative models, yield- 
ing a rich modeling capability suitable to the challenges of 
real spacecraft control. The interface between the two ex- 
ecutives ensures both that recovery sequences are smoothly 
merged into high-level schedule execution and that a high 
degree of reactivity is retained to effectively handle addi- 
tional failures during recovery. 

1 Introduction 

We are developing the first on-board AI system to control an 
actual spacecraft (Bernard et al. 1998). The mission, Deep
Space One (DS-1), is the first in NASA’s New Millennium 
Program (NMP), an aggressive series of technology demonstrations
intended to push Space Exploration into the 21st
century. DS-1 will launch in mid-1998 and will navigate
by near-Earth asteroid 3352 McAuliffe, Mars, and comet 
West-Kohoutek-Ikemura, taking pictures and sending back 
information to scientists on Earth. One key technology to 
be demonstrated is spacecraft autonomy, including robust 
plan execution (Pell et al. 1997b). Since aborting a plan
and taking time to re-plan can cause the spacecraft to miss 
critical mission activities, execution of plans must be highly 
robust. Hence, the execution system must maintain space- 
craft safety and successfully execute the plan, even in the 
presence of hardware faults and other unexpected events. 

This work is being implemented as part of the New Millennium
Remote Agent (NMRA) architecture (Pell et al.
1997a). This architecture integrates traditional real-time 
monitoring and control with constraint-based planning and 
scheduling (Muscettola 1994), robust multi-threaded execu- 
tion (Gat 1996), and model-based diagnosis and reconfigu- 
ration (Williams & Nayak 1996; 1997). 

Pell et al. (1997b) describes the approach we have taken 
to the automatic generation of robust plans, which incorpo- 
rate flexibility to be used by the execution system in case 
problems or opportunities arise during execution. This paper
focuses on the execution system itself. In particular, we
found it necessary to develop a hybrid procedural and deduc- 
tive executive in order to achieve the high levels of reliability 
required in the autonomous spacecraft domain. A procedu- 
ral executive provides sophisticated control constructs such 
as loops, parallel activity, locks, and synchronization which 
are used for robust schedule execution, hierarchical task de- 
composition, and routine configuration management. A de- 
ductive executive provides algorithms for sophisticated state 
inference and optimal failure recovery planning. The inte- 
grated executive enables designers to code knowledge via a 
combination of procedures and declarative models, yielding 
a rich modeling capability suitable to the challenges of real 
spacecraft control. The interface between the two executives 
ensures both that recovery sequences are smoothly merged 
into high-level schedule execution and that a high degree of 
reactivity is retained to effectively handle additional failures 
during recovery. 

This paper discusses our domain, the component execution
technologies, and the approach we took to integrating
these technologies into a hybrid executive that supports the 
strengths of each while minimizing potentially negative in- 
teractions between the two systems. The paper is organized 
as follows. Section 2 discusses the spacecraft domain and re- 
quirements which influence our design. Section 3 describes 


our problem and hybrid approach to execution systems. Sec- 
tion 4 describes the capabilities in our procedural executive. 
Section 5 addresses the capabilities in the deductive execu- 
tive. Section 6 shows how we have integrated the two sys- 
tems. Section 7 discusses some key points about our design. 
We then consider related work and conclude. 

2 Domain and Requirements 

The spacecraft domain presents a number of challenges for 
robust plan execution. 

2.1 High Reliability 

A central requirement of spacecraft operation is high reliability,
since spacecraft are expensive and often unique. Part
of this high reliability is achieved through the use of reli- 
able hardware. However, the harsh environment of space or 
the inability to test in all flight conditions can still cause 
unexpected hardware failures. When hardware failures or 
unexpected flight conditions do occur, the software system 
is required to compensate for such contingencies when possi- 
ble. This requirement dictates the use of an executive with 
elaborate system-level fault protection capabilities. Such 
an executive can rapidly react to contingencies by retry- 
ing failed actions, reconfiguring spacecraft subsystems, or 
putting the spacecraft into a safe state to prevent further, 
potentially irretrievable, damage. 

2.2 Concurrent Temporal Processes 

Many devices and systems must be controlled, leading to 
multiple threads of complex activity. These concurrent pro- 
cesses must be coordinated to control for interactions, such 
as vibrations of the thruster system violating stability re- 
quirements of the camera. Also, activities may have precise 
real-time constraints, such as taking a picture of an asteroid 
during a short time period of observability. 

2.3 Interacting Recoveries 

A particularly challenging problem in the design of a space- 
craft fault protection system arises from the combination of 
the above two properties: recovering failed activities in the 
presence of concurrent activity. As an example, consider two 
spacecraft subsystems in DS-1: the engine gimbal (EG) and 
the solar panel gimbal (SPG). A gimbal is part of a physical 
system that enables it to rotate. For example, the engine 
nozzle can be rotated to point in various directions without 
changing the spacecraft orientation, and the solar panels can 
be independently rotated to track the sun. In DS-1, both 
sets of gimbals communicate with the main computer via a 
shared board called the gimbal drive electronics (GDE). If 
either system experiences a communications failure, one way 
to reset the system is to power-cycle (turn on and off) the 
GDE. However, resetting the GDE to fix one system also re- 
sets the communication to the other system. In particular, 
resetting the engine gimbal, to fix an engine problem, causes 
temporary loss of control of the solar panels. Thus, fixing 
one problem can cause new problems to arise. To avoid this, 
the recovery system needs to take into account global con- 
straints from nominal schedule execution, rather than just 
making local fixes in an incremental fashion. Examples like 
this drove the design of our hybrid execution system. 


3 Approach 

In this section we first describe the problem we faced, and 
then our approach to solving it. 

3.1 The Problem 

Complex execution of spacecraft plans requires capabilities 
of both procedural and declarative execution systems. 

On the one hand, execution requires reactivity, time- 
sensitivity, and sophisticated control constructs such as loops, 
parallel activity, locks, and synchronization. The standard 
approach to this is to build executives which interpret direc- 
tives in a rich procedural language, make fast choices based 
on contextual knowledge, and choose alternatives when pre- 
vious choices fail (Firby 1978). 

However, this strict procedural approach has its limita- 
tions — it is hard to procedurally encode optimal choices in 
all, possibly degraded, situations. Specifically, execution re- 
quires choosing component configurations with different ca- 
pabilities and costs. Similarly, robust recovery may require 
novel combinations of actions in order to trade off costs and 
benefits. For example, the propulsion system on the Cassini 
spacecraft (Brown, Bernard, & Rasmussen 1995) has a com- 
plex set of valves, including explosive pyro valves which can 
change states only once, and ordinary valves with varying 
amounts of wear and tear. It is difficult to procedurally 
express the right valve choices to redirect fluid flow while 
minimizing costs and risks in all possible situations. 

On the other hand, a deductive executive of the form developed
by Williams & Nayak (1996) can reason efficiently
about such tradeoffs using declarative models of the costs 
and benefits of configurations and recoveries. Furthermore, 
the compositional nature of such models allows compact rep- 
resentations of the costs and benefits of each possible choice. 
Finally, deductive executives have sophisticated state in- 
ference algorithms, supporting the identification of hidden 
state, failed sensors, and multiple faults. However, declara- 
tive models can lack the flexibility and richness of activity 
description found in procedural execution systems. For example,
the Livingstone system (Williams & Nayak 1996) is
based on a propositional temporal logic which does not ex- 
plicitly model metric time or execution loops. Thus it is 
hard to encode knowledge like: 

To send a signal down to earth via an antenna, 
first turn off the antenna’s exciter, then turn on 
the antenna’s power supply, wait 5 seconds, and 
turn the exciter on again. 1 
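
By contrast, this kind of knowledge is straightforward to state procedurally. The following is a minimal Python sketch, not ESL and not the flight code: `send_command` and the device names are hypothetical stand-ins for whatever command interface the executive exposes, and the delay function is passed in so that the sequence can be exercised without actually waiting.

```python
import time


def power_up_antenna(send_command, wait=time.sleep):
    """Run the downlink preparation steps in a fixed order.

    `send_command(device, setting)` is a hypothetical command hook.
    """
    send_command("exciter", "off")       # protect the exciter from the surge
    send_command("power-supply", "on")   # this step causes the power surge
    wait(5)                              # metric delay, in seconds
    send_command("exciter", "on")        # safe to switch the exciter back on
```

The ordering and the five-second delay are the parts that a propositional temporal model without metric time cannot express directly, which is why such steps are natural candidates for procedural encoding.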

3.2 Hybrid Approach 

From this we see that the procedural and deductive ap- 
proaches to execution have complementary strengths and 
weaknesses. Hence, our approach is to develop a hybrid executive,
as follows:

• Use a procedural executive for timing, control knowl- 
edge, schedule execution, hierarchical task decomposi- 
tion, and routine configuration handling. 

• Use a deductive executive for state inference, novel 
responses based on global context, and cost/benefit
analysis. 

1 The reason for this requirement is that turning on the power supply
sends a surge of power which would destroy the sensitive exciter.
Hence the exciter should be switched off while the surge is happening, 
and then switched on again. 


• Work out clear interfaces between the two systems to 
exploit the strengths of each. 

Note that some divisions are arbitrary, since certain ca- 
pabilities exist in both systems. This gives the designer flex- 
ibility to choose the best system and language for specific 
purposes. For example, while routine configuration manage- 
ment can be handled either procedurally or declaratively, we 
have chosen to handle it procedurally. In our treatment, the 
procedural executive draws on the planning capability of the 
deductive executive by using it as a recovery expert, sending
it a set of global constraints that ensure that the resulting 
recovery plans can be integrated within the current execu- 
tion context. 

In the next sections we describe the procedural exec- 
utive and the deductive executive. To reflect their roles 
in the NMRA, we will often refer to the procedural executive
as Exec and the deductive executive as MIR, the mode
identification and reconfiguration system.

4 Procedural Executive 

Our procedural executive is based upon a sophisticated scripting
language called Executive Support Language (ESL) (Gat
1996), for describing control constructs necessary for exe- 
cution. Such constructs manage concepts of time, events, 
multiple methods, class hierarchies, and generic procedures. 
Some of these constructs are summarized later in this section.
An executive also needs a source of state update knowledge.
In NMRA, Exec benefits from being insulated from
the hardware details by relying on the results of the mode 
identification (MI) component of MIR (see Section 5). 


Figure 1: Procedural Executive Resource Manager

The executive manages a set of concurrent control tasks, 
as shown in Figure 1. Each control task requires a set of 
resources, or properties, to be established and maintained
over some period of time. For example, the activity of tak- 
ing pictures with a camera requires that the camera is on 
and functional. If some other activity requires the camera 
to be off, these two activities compete for the resource of 
controlling the camera’s power state. The executive must 


achieve, maintain, and monitor properties required for each
task, and resolve task resource conflicts. 

A task is represented at run-time by an independent execution
thread. Threads communicate with other threads directly
via signals, or indirectly via changes to a database.
Receipt of a signal or notification of a change to the database 
are examples of events. 

Each activity uses the (with-maintained-properties)
construct to declare those properties that it requires main- 
tained over its interval of execution. In this way, Exec un- 
derstands the constraints which support the entire current 
execution context. When a property is achieved and re- 
served for a task, it is said to be locked until the task re- 
linquishes it, so that other tasks will not be permitted to 
violate that property. Of course, the locks reflect properties 
true in the current state, and sometimes these properties 
can change despite the best efforts of the software system to 
maintain them. For example, switches on a spacecraft some- 
times change state accidentally. In this case, we describe the 
properties as lost or violated, and the tasks requiring them
as unsupported. 2

In the event that some property is lost or otherwise un- 
achievable without the help of a recovery expert, Exec sus- 
pends the unsupported threads, formulates a query based on 
the active constraints, and uses the automatic-recoveries
thread to send the query off to the recovery expert (in this 
case, MIR). 

When the recovery expert returns an action, Exec per- 
forms the action and then re-activates any suspended threads 
which may now be supported. The threads then attempt to 
re-establish their maintained conditions. Note that most 
Exec procedures count the number of times they have re- 
tried a particular approach, and try something else or give 
up if this retry counter exceeds a threshold. 

The automatic-recoveries thread remains in action for- 
ever, so unsatisfied constraints following execution of some 
recovery step will lead to a new recovery request. 

We now elaborate on some of the key constructs we have 
developed within the procedural executive that support the 
behavior described above. 

4.1 Achieving properties 

(achieve <property>) 

• If this is the first thread to request the property, then 
execute an achievement method for the property. 

• When achievement is successful, signal other waiting 
threads. 

• If some other thread is already achieving the property, 
then wait for it to finish. 

• If the property is inconsistent with a current lock, ei- 
ther wait for lock to be released or fail immediately 
(based on preferences set by the invoking thread). 
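
The first three bullets describe a "first requester runs, everyone else waits" discipline. The Python sketch below illustrates that discipline only; it is not the ESL implementation. The class name `Achiever` and its fields are hypothetical, the handling of conflicting locks (the fourth bullet) is omitted, and a property is assumed to stay true once achieved.

```python
import threading


class Achiever:
    """Hypothetical helper; `methods` maps a property to its achievement method."""

    def __init__(self, methods):
        self.methods = methods
        self.achieved = set()            # properties believed true
        self.pending = {}                # property -> Event set when an attempt ends
        self.guard = threading.Lock()    # protects the two collections above

    def achieve(self, prop):
        with self.guard:
            if prop in self.achieved:
                return True
            done = self.pending.get(prop)
            runner = done is None
            if runner:                   # first requester runs the method
                done = self.pending[prop] = threading.Event()
        if not runner:
            done.wait()                  # another thread is already achieving it
            with self.guard:
                return prop in self.achieved
        ok = False
        try:
            ok = bool(self.methods[prop]())
        finally:
            with self.guard:
                if ok:
                    self.achieved.add(prop)
                del self.pending[prop]
            done.set()                   # signal the waiting threads
        return ok


calls = []
achiever = Achiever({"camera-on": lambda: calls.append("on") or True})
workers = [threading.Thread(target=achiever.achieve, args=("camera-on",))
           for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
# Four threads asked for the property, but the achievement method ran only once.
```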

4.2 Maintained Properties 

(with-maintained-properties <properties> body)

2 Note that property locks can serve a role similar to typical locks in 
multi-threaded systems, such as semaphores and mutexes. However, 
there is a major difference since these property locks are database-relative,
and can hence be “taken” by the outside world changing.
Note also that naive use of property locks can result in deadlock, just
as occurs with standard locks in multi-threaded operating systems. 





• If properties are all currently true, body is executed. 

• If properties are false, the executive tries to achieve 
them first. 

• Once they are true, the executive locks the properties 
and executes body. 

• If the properties become false during execution of body,
signal this loss and let the enclosing context of body 
choose the response. 
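
The four bullets above map naturally onto a scoped construct. The following Python sketch uses a context manager to illustrate the idea; it is not ESL. The names `Executive`, `PropertyLost`, and `check` are hypothetical, the state database is a plain dictionary, and loss of a property is detected by polling from the body, whereas the real executive is notified through events.

```python
from collections import Counter
from contextlib import contextmanager


class PropertyLost(Exception):
    """Signals that a required property is false."""


class Executive:
    def __init__(self, state, achievers):
        self.state = state          # property -> bool (hypothetical state database)
        self.achievers = achievers  # property -> callable that tries to make it true
        self.locks = Counter()      # property -> number of tasks holding a lock

    def holds(self, prop):
        return self.state.get(prop, False)

    @contextmanager
    def maintained(self, *props):
        for prop in props:
            if not self.holds(prop):
                self.achievers[prop]()      # false properties are achieved first
            if not self.holds(prop):
                raise PropertyLost(prop)    # achievement did not succeed
        self.locks.update(props)            # lock once every property is true
        try:
            yield
        finally:
            self.locks.subtract(props)      # relinquish the locks on exit

    def check(self, *props):
        """Called by the body at safe points; raises if a property was lost."""
        for prop in props:
            if not self.holds(prop):
                raise PropertyLost(prop)


state = {"camera-on": False}
ex = Executive(state, {"camera-on": lambda: state.update({"camera-on": True})})
try:
    with ex.maintained("camera-on"):
        state["camera-on"] = False   # simulate a switch flipping off
        ex.check("camera-on")        # the loss is signalled to the enclosing context
except PropertyLost as lost:
    print("lost:", lost)
```

Here the `except` clause plays the role of the enclosing context that chooses the response to the loss.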

4.3 Device Management Idioms 

Devices and classes are formalized using generic descriptions.
Individual devices, switches, etc., are then modeled
as instances of these classes. 

(define-device-class :camera
  :power-function #'fsc-power-request
  :talk-function #'camera-talk-msg)

(define-device :camera_A :camera
  :powered-thru :power_bus_1
  :switched-thru :fsc_camera_sw1
  :ready-state ((:health_state :ok)
                (:power_state :on)))

Based on these device idioms, we have defined generic 
procedures for device configuration and management: 

(with-selected-device <class>
  (do-activity))

This construct selects a device of the class, achieves its 
ready-state, and then locks the properties of that ready- 
state and maintains them as it executes the enclosed activity. 
Based on the camera definition above, 

(with-selected-device : camera (take-pictures)) 

would select a camera (say camera_A), achieve its ready
state of being powered on and healthy, and then take pic- 
tures within a context that ensures that the health and 
power of the camera are maintained throughout picture tak- 
ing. 

4.4 Recovering failed properties 

In the case where a maintained property is lost (for example,
a device switch flips off unexpectedly or the engine performs
an automatic shutdown), the enclosing context of the
(with-maintained-properties) form determines the ap- 
propriate response. If no response is defined for the enclosing 
context, then the form fails. 

(with-automatic-recoveries body)

This form indicates that the response to lost properties 
within body is to suspend the thread while waiting for an 
automatic recovery, and then retry the body. Note that this 
is only one way to create an enclosing context to handle 
the lost properties notification. For example, a thread could 
establish its own local recovery expert, or decide to try al- 
ternative methods if properties are lost, rather than waiting 
for an automatically generated recovery.


4.4.1 Automatic Recoveries Thread 

A special thread in the executive manages the property 
locks. Whenever some property lock is violated: 

1. Suspend all tasks that have a violated lock.

2. Ask for an automatic recovery for all required locks.

3. Wait for a recovery action to be generated in response 
to this query. 

4. Execute the recovery action. 

5. Signal recovery-event. 

The effect of signaling recovery-event is to wake up all
threads that were suspended waiting for a property which
was restored (possibly as a result of the recovery action).
Each awakened thread then retries its body, attempting to
re-establish all of its required properties.

For properties which were restored by the recovery ac- 
tion, this will succeed. For properties which are still failed, 
the affected threads will block again, and wait for another 
recovery step. 

If the automatic-recoveries thread fails to return with 
a recovery action while some threads are blocking on re- 
quired properties, the waiting tasks fail automatically. This 
can happen either when the recovery expert believes no fur- 
ther actions need be achieved, or when it fails to find a 
solution to the recovery request. 
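
The five steps and the failure case can be summarized in a short loop. The Python sketch below is single-threaded and purely illustrative: `ask_expert` and `execute` are hypothetical stand-ins for the recovery expert and for command execution, tasks are reduced to sets of required properties, and the `max_steps` bound is an added assumption that plays the role of the retry counters mentioned earlier.

```python
def unsupported(tasks, state):
    """Names of tasks with at least one required property currently false."""
    return {name for name, required in tasks.items()
            if not all(state.get(prop, False) for prop in required)}


def automatic_recoveries(tasks, state, ask_expert, execute, max_steps=10):
    """Return (supported, failed) task names after attempting recovery."""
    for _ in range(max_steps):
        suspended = unsupported(tasks, state)        # step 1: suspend
        if not suspended:
            break
        required = set().union(*tasks.values())      # step 2: all required locks
        action = ask_expert(required, state)         # step 3: wait for an action
        if action is None:
            break                                    # no action: waiting tasks fail
        execute(action, state)                       # step 4: execute it
        # step 5: the next iteration re-evaluates which tasks are still blocked
    failed = unsupported(tasks, state)
    return set(tasks) - failed, failed


def toy_expert(required, state):
    """Hypothetical expert: switch on the first missing property, if any."""
    missing = sorted(prop for prop in required if not state.get(prop, False))
    return ("turn-on", missing[0]) if missing else None


def toy_execute(action, state):
    state[action[1]] = True


tasks = {"imaging": {"camera-on"}, "thrust": {"engine-on"}}
state = {"camera-on": False, "engine-on": True}
print(automatic_recoveries(tasks, state, toy_expert, toy_execute))
```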

5 Deductive Executive 

The deductive executive can be viewed as a discrete model- 
based controller that attempts to keep the spacecraft state 
on a trajectory that achieves a set of high-level input prop- 
erties (analogous to the set-point of a continuous controller). 
In the NMRA architecture, the deductive executive is also
referred to as MIR, reflecting that control is achieved through
mode identification (the sensing component) and mode re- 
configuration (the actuation component). 

MIR is model-based in the sense that it uses a single 
declarative, compositional model of the spacecraft to sup- 
port all of its capabilities. MIR views each component as 
a finite state machine, and the entire spacecraft as concurrent,
synchronous state machines. Nodes in each machine's graph represent
behavioral modes, and arcs represent possible transitions
among modes, some exogenous, some commandable.
Modes partition the state space of the component, and are 
specified using well-formed formulae in a propositional lan- 
guage. 
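
As a concrete illustration of this view, the Python sketch below encodes one component as a set of mode transitions, some commandable and some exogenous. The switch component, its mode names, and the `successors` helper are hypothetical, and the propositional formulae that specify each mode are left out.

```python
from typing import NamedTuple, Optional


class Transition(NamedTuple):
    source: str
    target: str
    command: Optional[str]   # None marks an exogenous (uncommanded) transition


# Hypothetical switch component with one failure mode.
SWITCH = [
    Transition("off", "on", "turn-on"),
    Transition("on", "off", "turn-off"),
    Transition("on", "stuck-off", None),
    Transition("off", "stuck-off", None),
]


def successors(transitions, mode, command=None):
    """Modes the component may be in after one step from `mode`."""
    commanded = [t.target for t in transitions
                 if command is not None and t.source == mode and t.command == command]
    exogenous = [t.target for t in transitions
                 if t.source == mode and t.command is None]
    # Without an applicable command the component nominally stays where it is.
    return set(commanded or [mode]) | set(exogenous)


print(successors(SWITCH, "off", "turn-on"))
```

Because exogenous transitions are always possible, a single command yields a set of candidate modes rather than one, which is what makes mode identification necessary.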

Mode identification (MI) involves tracking the most likely 
trajectory of the spacecraft state by observing all commands 
that are sent to the spacecraft and monitoring information 
from spacecraft sensors. Each point in a trajectory con- 
sists of the current behavioral mode of each component in 
the spacecraft. Components include both hardware devices 
and lower-level software modules. With modes identified, 
more detailed component state information is available at 
the propositional level. 

MI provides a service for tracking and reporting state 
changes to external software modules as they occur. The 
idea is that external modules will typically be interested 
only in higher-level properties (and corresponding higher-level
events) involving spacecraft state, rather than the finer
grained view available to MI. These abstract properties are 
naturally defined as well-formed formulae, and are easily 


tracked using MI’s inference capabilities. In the NMRA architecture,
MI’s state update service is an integral part of
the interface between MIR and Exec. 

Mode reconfiguration (MR) involves generating a sequence 
of actions that moves the spacecraft from its most likely cur- 
rent state to a new state that achieves a desired set of prop- 
erties. MR is comprised of two stages. First, the requested 
set of properties to be achieved is used to generate a specific 
goal state for each of the spacecraft’s components. Second, 
a sequence of actions that move the spacecraft from the cur- 
rent state to the goal state is incrementally generated. We 
refer to this second stage as model-based reactive planning 
(MRP). The sequence may be empty, meaning that no action
is necessary, or sequence generation may fail, meaning
that no reconfiguration plan could be found. Each action in
the sequence is a primitive operator from the perspective of 
MIR’s models. When MIR functions as a stand-alone de- 
ductive executive, each primitive operator corresponds to a 
command directly executable by an external software mod- 
ule. In the NMRA architecture, Exec specifies the desired 
properties of the goal state and primitive operators in the 
action sequence are bound to Exec procedures. 

MIR uses algorithms adapted and extended from model-based
diagnosis (de Kleer & Williams 1987; 1989) to provide
the above functionality. The main idea behind model-based 
diagnosis is to identify the set of possible component states 
in a system given models and observations of each compo- 
nent in the system. In many systems, especially spacecraft, 
there may be inadequate information in the models and ob- 
servations to uniquely identify every component’s state at all 
times. The approach is thus to select the most likely compo- 
nent configuration from amongst those that are consistent 
with the models and observations. 
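
The selection criterion can be illustrated with a deliberately naive Python sketch. It enumerates every mode assignment, ranks assignments by prior probability, and returns the first one that is consistent with the observations. This exhaustive enumeration is a stand-in chosen for clarity and would not scale to a real spacecraft model; the independence of component priors, the example numbers, and the `consistent` predicate are all assumptions made for illustration.

```python
from itertools import product
from math import prod


def most_likely_configuration(priors, consistent):
    """priors: component -> {mode: probability}; consistent: assignment -> bool."""
    names = sorted(priors)
    candidates = [dict(zip(names, modes))
                  for modes in product(*(priors[name] for name in names))]
    # Rank by joint prior, assuming component modes are independent.
    candidates.sort(key=lambda a: prod(priors[n][a[n]] for n in names), reverse=True)
    return next((a for a in candidates if consistent(a)), None)


# Hypothetical scenario: the switch was commanded on, yet no current is sensed.
priors = {
    "switch": {"on": 0.98, "stuck-off": 0.02},
    "sensor": {"ok": 0.99, "failed": 0.01},
}


def consistent(assignment):
    # A healthy sensor would report current if the switch were really on.
    return not (assignment["switch"] == "on" and assignment["sensor"] == "ok")


print(most_likely_configuration(priors, consistent))
```

In this example the nominal assignment is ruled out by the observation, so the selected configuration is the one with a stuck switch and a healthy sensor, which is more probable under the assumed priors than a failed sensor.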

The primary workhorse in the deductive executive is an 
extremely efficient conflict-directed best-first search algorithm
(Williams & Nayak 1996). The algorithm is exploited
by MI to identify the most likely component configuration 
consistent with models and observations, and by MR to se- 
lect a specific goal state having a specified set of properties. 
Additionally, a recent approach to MRP (Williams & Nayak 
1997) exploits the algorithm at compile time to compile away 
irrelevant information in system models in support of effi- 
cient planning. Such reuse of algorithms and system models 
across MIR’s capabilities is a signature of the model-based 
approach, and greatly simplifies the development and main- 
tenance of our deductive executive. 

6 Integration 

Having described the procedural and deductive executives, 
we now discuss how we combined these systems in the NMRA 
architecture to form an integrated hybrid executive. Recall 
that we exploit the procedural executive (Exec) for sched- 
ule execution, hierarchical task decomposition, and routine 
configuration management, while the deductive executive 
(MIR) is used both for state inference and failure response. 

Here we make explicit that the communication inter- 
face between Exec and MIR consists of the following: state 
updates from MIR to Exec, recovery requests from Exec 
to MIR, and recovery actions from MIR to Exec. Both 
state updates and recovery requests are represented as well- 
formed formulae in a propositional language shared between 
Exec and MIR. Recovery actions are instantiations of Exec’s
generic procedures. 

To support state updates, MIR continually tracks the 
most likely state of the spacecraft and informs Exec of changes 


to any higher-level property it wants tracked. Exec uses this 
state information to make task decomposition and configu- 
ration management decisions, and to determine the truth 
of properties needed by various threads of execution. Exec 
procedures are generally written to exploit MIR by allowing 
it to perform most inferences about spacecraft state that 
may be required. The properties to be tracked for Exec by 
MIR are agreed upon at compile time, but we note that the 
interface can be extended naturally to allow the notion of 
registering tracked properties on the fly; such run-time flex- 
ibility would allow for more efficient communication during 
critical mission phases, and enable Exec activities to dy- 
namically declare their own interface with MIR to improve 
modularity. 

Exec also views MIR as a recovery expert. As events 
occur in Exec’s schedule, it provides MIR with the current 
set of properties that must be maintained to support all 
active threads. At the time of invocation, some of these 
properties will be true and some may be false. Using its 
declarative models and knowledge of the current state, MIR 
generates an action sequence that is expected to move the 
spacecraft to a goal state in which all the requested prop- 
erties are achieved. MIR provides the first action in this 
sequence to Exec. Exec then executes this action and waits 
for state updates from MIR to determine the status of its 
required properties. The recovery interaction repeats with 
MIR until either all desired properties are achieved or MIR 
indicates that it can find no sequence to achieve those prop- 
erties. 

Three points are worth noting about the recovery in- 
terface. First, note that MIR sends only the first action 
in the recovery sequence. This improves the reactivity of 
the hybrid executive in two ways: Exec is free to make finer-grained
recovery requests to reflect any changes in the status
of schedule execution since the previous request, while MIR 
is free to factor any asynchronous spacecraft state changes 
that may have occurred into its next recovery plan. Achiev- 
ing this level of reactivity would be somewhat more difficult 
if the Exec were expected to robustly execute a full plan re- 
turned from MIR, for either the plan would then have to be 
much larger to reflect all contingencies or Exec would have 
to encode the robustness into the primitive procedures over 
which MIR reasons. 

Second, treating recovery actions as instances of generic 
procedures fully exploits the representational strengths of 
both systems. In practice, a natural modeling approach that 
addressed both representational convenience and efficiency was
to encapsulate all issues related to metric time and iteration 
inside Exec’s procedural constructs. This was natural, for 
instance, in the case of the downlink example provided in 
Section 3. 

Third, note that when used as a stand-alone configura- 
tion system, MIR is free to generate any sequence of actions 
resulting in a state with the requested properties. However, 
as part of the hybrid executive, properties requested dur- 
ing recovery are viewed as constraints on the entire recovery 
plan, not just the goal state; this means that MIR must not 
generate a recovery plan that is expected to deviate from 
a requested property. Depending on the approach to MRP 
that one adopts, this places additional computational re- 
quirements on the reactive planner that may require one to 
give up optimality or efficiency guarantees; this is indeed 
the case for the approach used in (Williams & Nayak 1997),
for example. Combined with the requirement on Exec to 
include all required properties as part of a recovery request, 
this restriction on MIR ensures that recovery sequences are 



smoothly merged into nominal schedule execution, resolving 
the problems of resource preemption and interacting recov- 
eries discussed in Section 2. 
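
The effect of treating requested properties as constraints on the whole plan can be seen in a toy planner. The Python sketch below uses plain breadth-first search over sets of true properties, which is a simple stand-in and not the MRP algorithm of Williams & Nayak. The action model is a hypothetical rendering of the GDE example from Section 2, and the `eg-soft-reset` action is invented purely to give the planner a non-disruptive alternative.

```python
from collections import deque


def plan_recovery(state, requested, actions):
    """Shortest action sequence achieving `requested`, or None if none exists.

    actions: name -> (preconditions, adds, deletes), each a set of properties.
    A requested property that already holds is never given up along the way.
    """
    start = frozenset(state)
    queue, seen = deque([(start, [])]), {start}
    while queue:
        current, plan = queue.popleft()
        if requested <= current:
            return plan
        for name, (pre, adds, deletes) in actions.items():
            if not pre <= current:
                continue
            nxt = frozenset((current - deletes) | adds)
            if (requested & current) - nxt or nxt in seen:
                continue   # would violate a maintained property, or already visited
            seen.add(nxt)
            queue.append((nxt, plan + [name]))
    return None


ACTIONS = {
    "gde-power-off": ({"gde-on"}, {"gde-off"}, {"gde-on", "eg-comm", "spg-comm"}),
    "gde-power-on": ({"gde-off"}, {"gde-on", "eg-comm", "spg-comm"}, {"gde-off"}),
    "eg-soft-reset": ({"gde-on"}, {"eg-comm"}, set()),   # hypothetical action
}

# Engine gimbal communication is lost while the solar panels are being tracked.
plan = plan_recovery({"gde-on", "spg-comm"}, {"eg-comm", "spg-comm"}, ACTIONS)
print(plan)
```

With `spg-comm` in the request, power-cycling the GDE is pruned because it would interrupt solar panel control. Under the interface described above, only the first action of the returned plan would be handed to Exec, and planning would be repeated on the next request.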

7 Discussion and Future Work 

In this section we discuss ongoing issues and limitations in 
our current hybrid executive and indicate future work. 

7.1 Compositionality and Modularity 

A major design goal within the NMRA is to develop modu- 
lar, compositional representations of spacecraft subsystems. 
A subtle limitation violating this goal exists in our current 
recovery framework; it arises in the context of multiple fail- 
ures, even when they occur in otherwise independent sub- 
systems. 

Consider two independent subsystems, managed separ- 
ately by two Exec activities. Suppose one subsystem can be 
recovered if it fails, and the other cannot. In the event of 
independent failures in each subsystem, the recovery framework
would proceed through two separate recovery attempts
and result predictably in the recovery of one subsystem. 
However, should those same failures instead occur in suf- 
ficiently close temporal proximity, MIR would report the 
failures to the Exec simultaneously. The Exec would then 
form a recovery request to MIR asking for the recovery of 
the conjunction of the two failed properties of interest. MIR 
would then be forced to report that no such recovery is pos- 
sible (since only one of the properties is recoverable) and the 
Exec activities managing the independent subsystems would 
both fail, resulting in the recovery of neither subsystem. 

The standard response to this problem is to emphasize 
that this limitation only arises in the case of simultaneous, 
independent failures. For most missions, such events are
deemed sufficiently unlikely that they are considered accept- 
able risks and beyond the scope of current fault protection 
systems. It is worth noting that this risk assessment is based 
in part on another limitation of current fault protection 
frameworks: the mindset within the spacecraft community 
is that unlikely hardware failures are less likely than a de- 
sign flaw in a complex fault protection system that attempts 
to support these unlikely failures. Our methods aim to ad- 
dress this general concern first and foremost by simplifying 
the design of robust execution systems to enable broader 
fault coverage. We view modularity and compositionality 
as key requirements of a simple design. 

The solution is to augment the recovery framework to en- 
able consideration of partial recoveries, rather than attempt- 
ing an all-or-nothing recovery. The open design issue is to 
understand whether this is best accomplished with modifications
to Exec or MIR. In the former case, Exec could
be augmented to formulate a series of independent partial 
recovery requests that would collectively achieve total fea- 
sible recovery, giving priority to the most urgent activities. 
The intuition here is to have Exec be more clever in asking 
for only what it needs, though this would currently require 
access to system models stored in MIR. Alternatively, MIR 
could generate recovery plans that satisfy a maximal subset 
of the requested properties, though in practice this would 
require additional communication between Exec and MIR 
to allow Exec to specify its preferences. These approaches 
are complementary, and striking a proper balance between 
them is an area of ongoing research. 
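
The two alternatives can be sketched as follows. This is a minimal illustration under stated assumptions, not the design adopted in Exec or MIR: the planner `toy_request_recovery`, the `urgency` table (lower value means more urgent), and the brute-force subset search are all hypothetical and chosen only for brevity.

```python
from itertools import combinations

RECOVERABLE = {"a_ok"}  # assumed: only property "a_ok" can be restored


def toy_request_recovery(properties):
    """Hypothetical all-or-nothing planner standing in for MIR."""
    if all(p in RECOVERABLE for p in properties):
        return [f"recover({p})" for p in sorted(properties)]
    return None


def exec_side_partial_recovery(failed, urgency, request_recovery):
    """Exec-side variant: one request per property, most urgent first."""
    plans = {}
    for prop in sorted(failed, key=urgency.get):
        plan = request_recovery({prop})
        if plan is not None:
            plans[prop] = plan
    return plans


def mir_side_maximal_subset(failed, request_recovery):
    """MIR-side variant: find a largest subset of properties that admits a plan.

    Exhaustive search is exponential in the number of properties and is used
    here only to make the idea explicit.
    """
    props = sorted(failed)
    for size in range(len(props), 0, -1):
        for subset in combinations(props, size):
            plan = request_recovery(set(subset))
            if plan is not None:
                return set(subset), plan
    return set(), []


failed = {"a_ok", "b_ok"}
exec_plans = exec_side_partial_recovery(
    failed, {"a_ok": 1, "b_ok": 0}, toy_request_recovery
)
mir_subset, mir_plan = mir_side_maximal_subset(failed, toy_request_recovery)
```

In this toy setting both variants recover `a_ok` and give up on `b_ok`. The Exec-side variant needs some knowledge of which properties are worth requesting separately, while the MIR-side variant needs a way for Exec to express which subset it prefers when several maximal subsets exist; the sketch ignores both concerns.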


7.2 Heterogeneous Knowledge Representation 

A strength of our hybrid executive system is that we can 
represent execution and repair knowledge in a procedural 
way, a declarative way, or a combination thereof, depending 
on the situation. This has proven to be useful in our domain. 
On the flip-side, this approach can lead to a fair amount of 
duplicated knowledge between Exec and MIR. We are cur- 
rently developing an approach to permit maximal sharing 
of domain models across the two systems that still affords 
the representational power and convenience of our hybrid 
approach. Note that this sharing of system models also sup- 
ports the partial recovery issue addressed above by enabling 
Exec to access system models during formulation of partial 
recovery sequences. 

7.3 Dealing with Uncertainty 

Ambiguity management is a critical issue in spacecraft oper- 
ations, primarily due to limitations in the number and type 
of onboard sensors and the possibility of sensor failures. Re- 
call that MIR currently tracks only the most likely trajec- 
tory of the spacecraft, a restriction driven primarily by the 
severely limited onboard computation available to it (10% 
of a 20MHz CPU on DS-1). MIR deals with ambiguity by 
assuming a worst-case scenario. For example, if there is am- 
biguity as to whether a device has failed or a communication 
path to that device has failed, MIR assumes that both have 
failed. Although this construction of a worst-case trajectory 
works well in the case of the DS-1 models, one can construct 
models for which the worst-case scenario leads to subopti-
mal recoveries and unsound conclusions. We are working on an
approach that allows MIR to track a small set of the most 
likely trajectories to deal more cleanly with ambiguity in an 
efficient manner. 
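
One way to picture tracking a small set of likely trajectories is a bounded, beam-style update, sketched below. This is not the algorithm used by MIR; the function `track_k_best`, the successor model, the state names, and all probabilities are assumptions invented for illustration.

```python
import heapq


def track_k_best(hypotheses, successors, observation, k):
    """Advance each (probability, state) hypothesis one step; keep the k most likely.

    successors(state) returns (transition_probability, next_state,
    predicted_observation) triples. Hypotheses whose prediction disagrees
    with the observation are discarded.
    """
    candidates = []
    for prob, state in hypotheses:
        for t_prob, next_state, predicted in successors(state):
            if predicted == observation:
                candidates.append((prob * t_prob, next_state))
    return heapq.nlargest(k, candidates, key=lambda c: c[0])


def toy_successors(state):
    """Hypothetical model: a silent device may mean a device or a link failure."""
    if state == "nominal":
        return [
            (0.98, "nominal", "reply"),
            (0.015, "device_failed", "no_reply"),
            (0.005, "link_failed", "no_reply"),
        ]
    return [(1.0, state, "no_reply")]


start = [(1.0, "nominal")]
two_best = track_k_best(start, toy_successors, "no_reply", k=2)
one_best = track_k_best(start, toy_successors, "no_reply", k=1)
```

With `k=1` the tracker commits to a single explanation, while with `k=2` both the device-failure and link-failure hypotheses are retained and can be separated by later observations, without assuming that both components have failed. The sketch does not merge duplicate states or renormalize probabilities, which a practical tracker would need to consider.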

Recall further that MI exports to Exec only the most 
likely state of the world. Exec acts as if this state were 
the true state and responds quickly in the face of new in- 
formation. Hence, Exec obeys the rapid feedback principle 
discussed by Schoppers (1995), and so is more likely to re-
main robust in the face of its unmodeled uncertainty. How-
ever, the lack of explicit communication of uncertainty and
ambiguity between MI and Exec makes it difficult to write 
ambiguity resolution procedures in the Exec. At present, 
such procedures must be either hard-wired in the code (e.g., 
do a calibration experiment before thrusting the engine) or 
accessed exclusively via the interface with MR. We are pur- 
suing an approach to active testing wherein Exec and MIR 
cooperate to synthesize optimal sequences from system mod- 
els that resolve ambiguity in a manner that preserves space- 
craft safety and non-renewable resources. 

8 Related Work 

This paper has described the integration of procedural and 
deductive capabilities within a hybrid executive. This sec- 
tion discusses related work and addresses procedural reason- 
ing systems that provide support for deduction, deductive 
reasoning systems that provide support for reaction, hybrid 
action description languages, and systems that cleanly sep- 
arate a deductive planning or inference component from a 
procedural execution component. 

Like our Exec, RAPS (Firby 1978) is centered around 
procedural reasoning, but provides language features to ex- 
press deductive state inference (in the form of memory-rules) 
and to incorporate the results of deductive problem-solving
systems (in the form of problem-solvers). RAPS also pro-
vides constructs to indicate resource locks for thread syn-
chronization, but these constructs are used only at the low-
est level of the system.

PRS (Georgeff & Lansky 1987) is also similar to our
Exec in that it provides a language based around proce-
dural reasoning and it has been applied to support diagno-
sis (Georgeff & Lansky 1986) and plan execution (Georgeff,
Lansky, & Schoppers 1987). PRS also provides support for
procedures to perform meta-level reasoning about execution
context (Ingrand & Georgeff 1990) and some constructs to
express resource usage to prevent harmful task interactions 
(e.g., the require construct). 

Our hybrid executive extends the capabilities of these 
systems (and similar procedural reasoners such as RPL (Mc-
Dermott 1993) and APEX (Freed & Remington 1997)) in
two major ways. The first is to provide explicit access to de- 
ductive model-based reasoning for diagnosis and planning. 
The second is to extend resource locks into a task-level con- 
struct and to provide a way to use them to constrain the 
results of deductive inference. 

While Exec, RAPS and PRS may be viewed as procedu- 
ral reasoning systems with deductive attachments, a large 
body of work in automated reasoning has focused on deduc-
tive reasoning systems with procedural attachments (Gene-
sereth & Nilsson 1987). Most of this work focuses on using
procedures to support inference, rather than on describing 
action in a dynamic environment. However, researchers have 
recently begun exploiting the ability to view logical systems 
like Prolog (Clocksin & Mellish 1981) through both an op-
erational and a denotational semantics to create logical de-
scriptions of procedures which can support both procedural
and deductive reasoning in the presence of a changing en-
vironment. Example systems include Golog (Levesque et
al. 1997; de Giacomo, Lesperance, & Levesque 1997) and
InterRAP (Muller & Pischel 1994).

Estlin, Chien, & Wang (1997) describe a hybrid approach
to action descriptions for planning systems that integrates 
Hierarchical Task Network (HTN) planning, which can be 
viewed as a procedural representation, with operator-based 
planning, which deduces action sequences from first princi- 
ples. Also related is the OSCAR architecture (Pollock 1998), 
which integrates planning and reasoning activities within a 
general-purpose defeasible reasoner. 

Perhaps the most typical approach to developing a hy- 
brid system is to develop separate components for both 
styles of reasoning and define a clear interface to support 
the interaction. Much of the research on integrating plan-
ning and execution (Wilkins et al. 1995; Bonasso et al. 1997;
Hayes-Roth 1995; Simmons 1990; Currie & Tate 1991; Pell
et al. 1998, for example) takes this approach. Whereas 
these systems generally treat the planner and executive as 
functioning on widely different time-frames, our approach 
exploits fast deduction to provide these capabilities within 
the reactive execution loop itself. 

In terms of separate components for procedural execu- 
tion and deductive state inference (as opposed to planning), 
Ogasawara (1991) describes a hybrid architecture based on 
Bayesian networks and decision-theory for state inference, 
where the results of inference can be used by a system ex- 
ecuting high-level procedures. The Touring Machine archi- 
tecture (Ferguson 1992) also provides a separate capability 
for deductive world modeling that informs the activities of 
a procedural executive. 


9 Conclusion 

This paper has described the integration of procedural and 
deductive capabilities within a hybrid executive. While there 
has been much research on integrating planning or state 
inference with execution and on incorporating procedures 
within deductive systems or vice-versa, comparatively lit- 
tle work has attempted to do so within a fast reactive loop 
or in the presence of concurrent activities. In addressing 
such an integration, we found we had to constrain or mod- 
ify the component systems to address a number of technical 
problems. These problems included resource preemption, 
interacting concurrent recoveries, and non-compositionality 
of independent recoveries. The hybrid executive we have de- 
veloped addresses all these issues to some extent, and per- 
mits an extremely flexible and powerful representation of 
knowledge while still remaining robust and reactive. 

Now that we have this flexibility, a major challenge re- 
mains to understand how to take most advantage of it. Key 
issues include the following: 

• Understanding the tradeoffs between knowledge repre- 
sentations that are procedural, declarative, or hybrid. 

• How to ensure consistency of knowledge across hetero- 
geneous representations. 

• Developing robust approaches to active sensing and 
active diagnosis within a hybrid executive. 

• More integrated approaches to uncertainty manage- 
ment. 

Lastly, it should be noted that our hybrid approach has 
evolved considerably over the last few years, based on lessons 
in the real spacecraft domain. We have now developed hy-
brids between Livingstone (Williams & Nayak 1996) and two
different procedural execution systems: ESL (Gat 1996) and 
RAPS (Firby 1978). On the basis of this, we hope that our 
approach will be useful for integrating a wide variety of pro- 
cedural and deductive executives. However, we found the 
explicit support for language extensions in ESL to be ex- 
tremely useful for developing the new language constructs 
which enabled the strong integration discussed in this paper. 
This suggests that language extension capabilities will ease
other attempts at a similar integration.